AITopics | multi-source knowledge

Collaborating Authors

multi-source knowledge

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

TraceCoder: Towards Traceable ICD Coding via Multi-Source Knowledge Integration

Ren, Mucheng, Chen, He, Yan, Yuchen, Hu, Danqing, Xu, Jun, Zeng, Xian

arXiv.org Artificial IntelligenceNov-12-2025

Automated International Classification of Diseases (ICD) coding assigns standardized diagnosis and procedure codes to clinical records, playing a critical role in healthcare systems. However, existing methods face challenges such as semantic gaps between clinical text and ICD codes, poor performance on rare and long-tail codes, and limited interpretability. To address these issues, we propose TraceCoder, a novel framework integrating multi-source external knowledge to enhance traceability and explainability in ICD coding. TraceCoder dynamically incorporates diverse knowledge sources, including UMLS, Wikipedia, and large language models (LLMs), to enrich code representations, bridge semantic gaps, and handle rare and ambiguous codes. It also introduces a hybrid attention mechanism to model interactions among labels, clinical context, and knowledge, improving long-tail code recognition and making predictions interpretable by grounding them in external evidence. Experiments on MIMIC-III-ICD9, MIMIC-IV-ICD9, and MIMIC-IV-ICD10 datasets demonstrate that TraceCoder achieves state-of-the-art performance, with ablation studies validating the effectiveness of its components. TraceCoder offers a scalable and robust solution for automated ICD coding, aligning with clinical needs for accuracy, interpretability, and reliability.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2510.15267

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Health Care Providers & Services (0.72)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.69)
Health & Medicine > Health Care Technology > Medical Record (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

CardRewriter: Leveraging Knowledge Cards for Long-Tail Query Rewriting on Short-Video Platforms

Gong, Peiyuan, Zhu, Feiran, Yin, Yaqi, Dai, Chenglei, Zhang, Chao, Zheng, Kai, Bao, Wentian, Mao, Jiaxin, Zhang, Yi

arXiv.org Artificial IntelligenceOct-14-2025

Short-video platforms have rapidly become a new generation of information retrieval systems, where users formulate queries to access desired videos. However, user queries, especially long-tail ones, often suffer from spelling errors, incomplete phrasing, and ambiguous intent, resulting in mismatches between user expectations and retrieved results. While large language models (LLMs) have shown success in long-tail query rewriting within e-commerce, they struggle on short-video platforms, where proprietary content such as short videos, live streams, micro dramas, and user social networks falls outside their training distribution. To address this challenge, we introduce \textbf{CardRewriter}, an LLM-based framework that incorporates domain-specific knowledge to enhance long-tail query rewriting. For each query, our method aggregates multi-source knowledge relevant to the query and summarizes it into an informative and query-relevant knowledge card. This card then guides the LLM to better capture user intent and produce more effective query rewrites. We optimize CardRewriter using a two-stage training pipeline: supervised fine-tuning followed by group relative policy optimization, with a tailored reward system balancing query relevance and retrieval effectiveness. Offline experiments show that CardRewriter substantially improves rewriting quality for queries targeting proprietary content. Online A/B testing further confirms significant gains in long-view rate (LVR) and click-through rate (CTR), along with a notable reduction in initiative query reformulation rate (IQRR). Since September 2025, CardRewriter has been deployed on Kuaishou, one of China's largest short-video platforms, serving hundreds of millions of users daily.

information retrieval, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2510.10095

Country: Asia > China (0.49)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Services > e-Commerce Services (0.35)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)

Add feedback

Synergistic Fusion of Multi-Source Knowledge via Evidence Theory for High-Entropy Alloy Discovery

Ha, Minh-Quyet, Le, Dinh-Khiet, Dao, Duc-Anh, Vu, Tien-Sinh, Nguyen, Duong-Nguyen, Nguyen, Viet-Cuong, Kino, Hiori, Huynh, Van-Nam, Dam, Hieu-Chi

arXiv.org Artificial IntelligenceFeb-20-2025

Discovering novel high-entropy alloys (HEAs) with desirable properties is challenging due to the vast compositional space and complex phase formation mechanisms. Efficient exploration of this space requires a strategic approach that integrates heterogeneous knowledge sources. Here, we propose a framework that systematically combines knowledge extracted from computational material datasets with domain knowledge distilled from scientific literature using large language models (LLMs). A central feature of this approach is the explicit consideration of element substitutability, identifying chemically similar elements that can be interchanged to potentially stabilize desired HEAs. Dempster-Shafer theory, a mathematical framework for reasoning under uncertainty, is employed to model and combine substitutabilities based on aggregated evidence from multiple sources. The framework predicts the phase stability of candidate HEA compositions and is systematically evaluated on both quaternary alloy systems, demonstrating superior performance compared to baseline machine learning models and methods reliant on single-source evidence in cross-validation experiments. By leveraging multi-source knowledge, the framework retains robust predictive power even when key elements are absent from the training data, underscoring its potential for knowledge transfer and extrapolation. Furthermore, the enhanced interpretability of the methodology offers insights into the fundamental factors governing HEA formation. Overall, this work provides a promising strategy for accelerating HEA discovery by integrating computational and textual knowledge sources, enabling efficient exploration of vast compositional spaces with improved generalization and interpretability.

high-entropy alloy discovery, multi-source knowledge, synergistic fusion, (1 more...)

arXiv.org Artificial Intelligence

2502.14631

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)

Add feedback

Multi-Source Spatial Knowledge Understanding for Immersive Visual Text-to-Speech

He, Shuwei, Liu, Rui

arXiv.org Artificial IntelligenceDec-23-2024

Visual Text-to-Speech (VTTS) aims to take the environmental image as the prompt to synthesize reverberant speech for the spoken content. Previous works focus on the RGB modality for global environmental modeling, overlooking the potential of multi-source spatial knowledge like depth, speaker position, and environmental semantics. To address these issues, we propose a novel multi-source spatial knowledge understanding scheme for immersive VTTS, termed MS2KU-VTTS. Specifically, we first prioritize RGB image as the dominant source and consider depth image, speaker position knowledge from object detection, and Gemini-generated semantic captions as supplementary sources. Afterwards, we propose a serial interaction mechanism to effectively integrate both dominant and supplementary sources. The resulting multi-source knowledge is dynamically integrated based on the respective contributions of each source.This enriched interaction and integration of multi-source spatial knowledge guides the speech generation model, enhancing the immersive speech experience. Experimental results demonstrate that the MS$^2$KU-VTTS surpasses existing baselines in generating immersive speech. Demos and code are available at: https://github.com/AI-S2-Lab/MS2KU-VTTS.

artificial intelligence, knowledge, spatial reasoning, (18 more...)

arXiv.org Artificial Intelligence

2410.14101

Country:

Asia > Mongolia (0.05)
Asia > China > Inner Mongolia > Hohhot (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre:

Research Report > New Finding (0.66)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)

Add feedback

Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling

Liu, Rui, Hu, Yifan, Ren, Yi, Yin, Xiang, Li, Haizhou

arXiv.org Artificial IntelligenceDec-19-2023

Conversational Speech Synthesis (CSS) aims to accurately express an utterance with the appropriate prosody and emotional inflection within a conversational setting. While recognising the significance of CSS task, the prior studies have not thoroughly investigated the emotional expressiveness problems due to the scarcity of emotional conversational datasets and the difficulty of stateful emotion modeling. In this paper, we propose a novel emotional CSS model, termed ECSS, that includes two main components: 1) to enhance emotion understanding, we introduce a heterogeneous graph-based emotional context modeling mechanism, which takes the multi-source dialogue history as input to model the dialogue context and learn the emotion cues from the context; 2) to achieve emotion rendering, we employ a contrastive learning-based emotion renderer module to infer the accurate emotion style for the target utterance. To address the issue of data scarcity, we meticulously create emotional labels in terms of category and intensity, and annotate additional emotional information on the existing conversational dataset (DailyTalk). Both objective and subjective evaluations suggest that our model outperforms the baseline models in understanding and rendering emotions. These evaluations also underscore the importance of comprehensive emotional annotations. Code and audio samples can be found at: https://github.com/walker-hyf/ECSS.

emotion, node, utterance, (16 more...)

arXiv.org Artificial Intelligence

2312.11947

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > China > Hong Kong (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(4 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Synthesis (0.88)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.68)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)
(2 more...)

Add feedback